“Exogenous reads within Tissue with Immune Cells”
exoTIC is a tool for identifying exogenous sequences in RNA-seq samples that are heavily dominated by human reads.
[insert instructions here]
To run the exoTIC pipeline, install the following R packages from CRAN:
install.packages("tidyr")
install.packages("stringr")
install.packages("dplyr")
install.packages("doMC")
install.packages("gbm")
install.packages("purrr")
install.packages("readxl")
install.packages("tibble")
You will also need several packages that are distributed through Bioconductor. First, install BiocManager with the following command:
install.packages("BiocManager")
Then install the following packages via BiocManager:
BiocManager::install("limma")
BiocManager::install("edgeR")
BiocManager::install("snm")
BiocManager::install("GenomicDataCommons")
BiocManager::install("decontam")
Input to the pipeline must be a folder containing FASTQ files. If you are starting from CRAM or BAM files, please see Starting from CRAM or BAM files.
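Before running the pipeline, it can help to confirm that the input folder holds complete read pairs. The sketch below is not part of exoTIC: `find_unpaired_fastq` is a hypothetical helper, and it assumes paired files are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`, as produced by the BAM conversion commands in this README. It prints the name of every sample whose `_1` file has no matching `_2` file.

```shell
# Hypothetical helper (not part of exoTIC): report samples missing a mate file.
# Assumes the naming scheme <sample>_1.fastq.gz / <sample>_2.fastq.gz.
find_unpaired_fastq() {
    local dir="$1"
    local r1 sample
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        sample="${r1%_1.fastq.gz}"          # strip the read-1 suffix
        if [ ! -e "${sample}_2.fastq.gz" ]; then
            basename "$sample"              # print the sample name only
        fi
    done
}
```

Calling `find_unpaired_fastq /this/is/a/path/fastqs` (a placeholder path) prints nothing when every sample has both files.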
CRAM and BAM are compressed formats for storing sequence reads, usually aligned to a reference genome. To use the pipeline, convert them to FASTQ, a plain-text format that stores the reads and their base qualities without alignment information. Using SAMtools, CRAM files can be converted to BAM files, and BAM files can be converted to FASTQ files.
Install SAMtools in your UNIX environment following these directions.
Convert single CRAM->BAM
samtools view -b -o output_filename.bam input_filename.cram
Sort single BAM file
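For a single file, the sort step might look like the sketch below. It mirrors the `samtools sort -n` call used in the batch loop later in this section; `-n` sorts by read name so that the two mates of a pair sit next to each other, which the FASTQ conversion relies on. The wrapper `sort_bam_by_name` is a hypothetical helper, and the `_sorted.bam` suffix follows the batch example.

```shell
# Hypothetical helper: name-sort one BAM file.
# Output is written next to the input as <name>_sorted.bam.
sort_bam_by_name() {
    local input_bam="$1"
    local output_bam="${input_bam%.bam}_sorted.bam"
    samtools sort -n "$input_bam" -o "$output_bam"
}
```

Usage, with a placeholder file name: `sort_bam_by_name input_filename.bam`.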
Convert single sorted BAM->FASTQ
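For a single name-sorted BAM file, the conversion might look like the sketch below. It reuses the `samtools fastq` options from the batch conversion later in this section: `-1` and `-2` receive the first and second mates, `-0` and `-s` send other and singleton reads to `/dev/null`, and `-n` leaves read names unchanged. The wrapper `bam_to_fastq` is a hypothetical helper, and deriving the sample name from the `_sorted.bam` suffix is an assumption.

```shell
# Hypothetical helper: split one name-sorted BAM into paired, gzipped FASTQ files.
# Assumes the input is named <sample>_sorted.bam.
bam_to_fastq() {
    local sorted_bam="$1"
    local sample="${sorted_bam%_sorted.bam}"
    samtools fastq -@ 8 "$sorted_bam" \
        -1 "${sample}_1.fastq.gz" \
        -2 "${sample}_2.fastq.gz" \
        -0 /dev/null -s /dev/null -n
}
```

Usage, with a placeholder file name: `bam_to_fastq input_filename_sorted.bam`.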
Convert CRAM -> BAM in batch
module load samtools
cd /this/is/a/path/crams
for i in *.cram; do
    filename=$(basename "$i")
    fname="${filename%.*}"
    # Skip CRAM files that already have a converted BAM
    if ! test -f "${fname}.bam"; then
        samtools view -b -o "${fname}.bam" "$i"
    fi
done
Sort the BAM files in a batch
module load samtools
cd /this/is/a/path/bams
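for i in *.bam; do
    # Do not re-sort output from an earlier run
    case "$i" in *_sorted.bam) continue ;; esac
    filename=$(basename "$i")
    fname="${filename%.*}"
    # Sort by read name (-n) so that paired reads are adjacent
    if ! test -f "${fname}_sorted.bam"; then
        samtools sort -n "$i" -o "${fname}_sorted.bam"
    fi
done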
Convert BAM -> FASTQ in batch
module load samtools
cd /this/is/a/path/sorted_bams
for i in *_sorted.bam; do
    # s is the sample name, assumed here to be the file name without "_sorted.bam"
    s="${i%_sorted.bam}"
    if ! test -f "${s}_1.fastq.gz"; then
        samtools fastq -@ 8 "$i" \
            -1 "${s}_1.fastq.gz" \
            -2 "${s}_2.fastq.gz" \
            -0 /dev/null -s /dev/null -n
    fi
done
Example: Stacked Bar Plot of Microbe Abundance
(note: I’d like to make an example dataset that I run through the whole process with) (note: will need to change file path once image is on github)
Example: Volcano Plot